# Self-supervised Pretraining
## Resencl OpenMind SimCLR

A model from the first comprehensive benchmark study of self-supervised learning on 3D medical imaging data.

*3D Vision · AnonRes · 16 downloads · 0 likes*
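SimCLR-style pretraining trains the encoder to pull two augmented views of the same sample together while pushing other samples apart, using the NT-Xent contrastive loss. A minimal NumPy sketch of that loss (illustrative only; the function and parameter names here are not from the benchmark's codebase):

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent (normalized temperature-scaled cross-entropy) contrastive loss.

    z1, z2: (N, D) embeddings of two augmented views of the same N samples.
    Illustrative sketch, not the benchmark's actual implementation.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize rows
    sim = z @ z.T / temperature                        # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                     # a view is never its own candidate
    # The positive for view i is the other view of the same sample: i <-> i + N.
    targets = np.concatenate([np.arange(n) + n, np.arange(n)])
    log_denom = np.log(np.exp(sim).sum(axis=1))        # log-sum-exp over all candidates
    loss = -(sim[np.arange(2 * n), targets] - log_denom)
    return loss.mean()
```

Masking the diagonal with `-inf` removes self-similarity from the softmax denominator, so each row is a (2N-1)-way classification whose correct answer is the paired view.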
## Resencl OpenMind VoCo

A model from the first comprehensive benchmark study of self-supervised learning on 3D medical imaging data.

*3D Vision · AnonRes · 16 downloads · 0 likes*
## Hubert Ecg Large

A self-supervised foundation model for broadly scalable cardiac applications, trained on 9.1 million 12-lead ECGs covering 164 cardiovascular diseases.

*Molecular Model · Transformers · Edoardo-BS · 168 downloads · 1 like*
## Berturk Legal

BERTurk-Legal is a Transformer-based language model designed for prior-case retrieval in the Turkish legal domain.

*Large Language Model · Transformers, Other · MIT license · KocLab-Bilkent · 382 downloads · 6 likes*
## Molformer XL Both 10pct

MoLFormer is a chemical language model pretrained on 1.1 billion molecular SMILES strings from ZINC and PubChem. This version was trained on a 10% sample of each dataset.

*Molecular Model · Transformers · Apache-2.0 license · ibm-research · 171.96k downloads · 19 likes*
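Chemical language models like MoLFormer operate on SMILES strings split into atom- and bond-level tokens. A minimal regex-based SMILES tokenizer in that spirit (the pattern and vocabulary here are a common illustrative choice, not MoLFormer's own tokenizer):

```python
import re

# Bracket atoms, two-letter elements, aromatic atoms, bonds, ring closures.
# The exact regex/vocabulary used by MoLFormer may differ.
SMILES_TOKENS = re.compile(
    r"\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p"
    r"|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9]"
)

def tokenize_smiles(smiles):
    tokens = SMILES_TOKENS.findall(smiles)
    if "".join(tokens) != smiles:  # the tokenizer must cover every character
        raise ValueError(f"untokenizable SMILES: {smiles!r}")
    return tokens
```

For example, aspirin's SMILES `CC(=O)Oc1ccccc1C(=O)O` splits into single-character atom, bond, and ring-closure tokens, while bracket atoms such as `[C@H]` stay as one token.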
## Videomae Small Finetuned Ssv2

VideoMAE is a self-supervised video model based on the Masked Autoencoder (MAE), fine-tuned on the Something-Something V2 dataset for video classification.

*Video Processing · Transformers · MCG-NJU · 140 downloads · 0 likes*
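VideoMAE's MAE-style pretraining hides a very high ratio of video patches (around 90%) using "tube" masking: the same spatial positions are masked in every frame. A minimal sketch of that masking step (illustrative; names and shapes are assumptions, not VideoMAE's code):

```python
import numpy as np

def tube_mask(num_frames, tokens_per_frame, mask_ratio=0.9, rng=None):
    """VideoMAE-style tube masking (illustrative sketch).

    The same spatial patches are hidden in every frame ("tubes"), which
    prevents the model from trivially copying a patch from a nearby frame.
    """
    rng = rng if rng is not None else np.random.default_rng()
    num_masked = int(tokens_per_frame * mask_ratio)
    masked_cols = rng.choice(tokens_per_frame, size=num_masked, replace=False)
    mask = np.zeros((num_frames, tokens_per_frame), dtype=bool)
    mask[:, masked_cols] = True  # repeat the same spatial mask over time
    return mask
```

With a 14x14 patch grid per frame (196 tokens) and a 0.9 ratio, 176 spatial positions are masked in every frame, leaving only 20 visible tubes for the encoder.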
## Regnety 640.seer

RegNetY-64GF feature backbone, pretrained with the self-supervised SEER method on 2 billion random internet images.

*Image Classification · Transformers · Other license · timm · 32 downloads · 0 likes*
## Vit Msn Large

Vision Transformer pretrained with the MSN method; performs well in few-shot scenarios.

*Image Classification · Transformers · Apache-2.0 license · facebook · 48 downloads · 1 like*
## Vit Msn Small

A Vision Transformer pretrained with the MSN method, well suited to few-shot image classification.

*Image Classification · Transformers · Apache-2.0 license · facebook · 3,755 downloads · 1 like*
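MSN (Masked Siamese Networks) trains by matching the soft prototype assignment of a masked view of an image to the sharper assignment of its unmasked view. A minimal NumPy sketch of that objective (illustrative; function names, temperatures, and shapes are assumptions, not the ViT-MSN implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def msn_loss(anchor, target, prototypes, temp=0.1, target_temp=0.025):
    """MSN-style prototype matching (illustrative sketch).

    anchor: (N, D) embeddings of masked views; target: (N, D) embeddings of
    the unmasked views; prototypes: (K, D) learnable cluster centers. The
    loss is the cross-entropy between the anchor's soft prototype assignment
    and the sharper (lower-temperature) target assignment.
    """
    def assign(z, t):
        z = z / np.linalg.norm(z, axis=1, keepdims=True)
        p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
        return softmax(z @ p.T / t, axis=1)  # (N, K) soft assignments

    p_anchor = assign(anchor, temp)
    p_target = assign(target, target_temp)   # sharpened target distribution
    return -(p_target * np.log(p_anchor + 1e-12)).sum(axis=1).mean()
```

Because pretraining clusters images against shared prototypes rather than fitting a task head, the learned features transfer well with only a handful of labels per class, which is why the description highlights few-shot scenarios.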
## Videomae Base Short Ssv2

VideoMAE is a self-supervised video pretraining model based on the Masked Autoencoder (MAE), pretrained for 800 epochs on the Something-Something V2 dataset.

*Video Processing · Transformers · MCG-NJU · 112 downloads · 2 likes*
## Dit Large Finetuned Rvlcdip

A Transformer-based document image classification model, pretrained on IIT-CDIP and fine-tuned on RVL-CDIP.

*Image Classification · Transformers · microsoft · 67 downloads · 8 likes*
## Dit Base Finetuned Rvlcdip

DiT is a Transformer-based document image classification model, pretrained on the IIT-CDIP dataset and fine-tuned on the RVL-CDIP dataset.

*Image Classification · Transformers · microsoft · 31.99k downloads · 30 likes*
## Beit Base Patch16 384

BEiT is a vision Transformer-based image classification model, pretrained in a self-supervised manner on ImageNet-21k and fine-tuned on ImageNet-1k.

*Image Classification · Apache-2.0 license · microsoft · 146 downloads · 5 likes*
## Beit Large Patch16 384

BEiT is a vision Transformer-based image classification model, pretrained in a self-supervised manner on ImageNet-21k and fine-tuned on ImageNet-1k.

*Image Classification · Apache-2.0 license · microsoft · 44 downloads · 0 likes*
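BEiT's self-supervised objective is masked image modeling: block-wise regions of image patches are hidden and the model predicts discrete visual tokens for them. A minimal sketch of the block-wise masking step (illustrative; parameter values and names are assumptions, not BEiT's code):

```python
import numpy as np

def blockwise_mask(grid=14, num_masked=75, min_block=4, rng=None):
    """BEiT-style block-wise masking (illustrative sketch).

    Repeatedly masks rectangular blocks of image patches until at least
    `num_masked` of the grid*grid patches are covered; the final count may
    slightly overshoot, as blocks are allowed to overlap.
    """
    rng = rng if rng is not None else np.random.default_rng()
    mask = np.zeros((grid, grid), dtype=bool)
    while mask.sum() < num_masked:
        remaining = num_masked - int(mask.sum())
        area = int(rng.integers(min_block, remaining + min_block + 1))
        aspect = rng.uniform(0.3, 1 / 0.3)   # random block aspect ratio
        h = max(1, int(round(np.sqrt(area * aspect))))
        w = max(1, int(round(np.sqrt(area / aspect))))
        if h > grid or w > grid:
            continue                          # resample an oversized block
        top = int(rng.integers(0, grid - h + 1))
        left = int(rng.integers(0, grid - w + 1))
        mask[top:top + h, left:left + w] = True
    return mask
```

Masking contiguous blocks rather than independent patches forces the model to reason about larger structures instead of interpolating from immediate neighbors.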
## Wavlm Base Plus

WavLM is a large-scale self-supervised speech model from Microsoft, pretrained on 16 kHz speech audio and applicable to a wide range of speech processing tasks.

*Speech Recognition · Transformers · English · microsoft · 673.32k downloads · 31 likes*